PREFACE

The ASSEMBLER described in the following pages assembles programs
written in assembly language for small one- or two-address machines
which have indirect addressing but neither index registers nor the
base-displacement way of addressing, which have words of up to 48 bits
(16 octal digits), and which permit double relocation. Macros are not
allowed.

It  assembles  according  to  the  description  of  the  particular 
machine  that  it  reads.   In  this  sense  it  is  a  "general"  assembler. 

The  author  wishes  to  thank  Professor  C.  W.  Gear  for  his 
valued  guidance  and  support  during  the  preparation  of  this  thesis. 
The  technical  assistance  provided  by  Professor  Gear  is  especially 
appreciated.   Thanks  are  also  extended  to  Miss  Barbara  Hurdle  for 
typing  the  final  manuscript. 
1.   INTRODUCTION

The following sections describe the ASSEMBLER, which is a
collection of programs written in assembler (F or G) and operating
on the IBM System 360/50-75. It assembles programs for the Digital
Equipment Corporation's PDP-7, PDP-8, and PDP-9 computers.

This collection of programs is general and flexible. It does not
depend upon a particular machine; instead it accepts a description of
the machine, according to which programs written in assembly language
for that machine are decoded and output is produced in a form that can
be loaded into the machine. After a program has been assembled it may
be punched on cards or paper tape or may be saved in a file.


2.   ASSEMBLY 

An  assembler  transforms  the  symbolic  language  source 
programs  into  machine  language.   In  the  case  of  PDP-8,  every  machine 
instruction  occupies  exactly  one  location  in  its  memory.   The  assembly 
language  program  is  a  sequence  of  input  lines  to  the  assembler  which 
specifies  these  instructions  in  a  symbolic  form.   The  assembler  reads 
these  lines,  decodes  them,  and  constructs  or  assembles  the  corresponding 
binary  words  for  the  specific  computer. 

In most assemblers, symbolic names for the memory locations are
defined by their appearance at the beginning of an input line.
Symbolic names for operation codes appear next, sometimes followed by
operands. Comments may then follow.

The  assembler  lists  a  value  corresponding  to  the  value  of 
the  operator,  augmented  by  the  value  of  the  operand.   Each  such  value 
is  associated  with  an  address  by  means  of  the  program  counter  (PC).   The 
PC  contains  a  value  which  is  incremented  after  each  word  is  generated. 
So  normally  assembled  words  are  placed  in  serially  ascending  locations 
in  memory.   Some  input  lines  will  not  generate  words  but  are  instructions 
to  the  assembler,  for  example  the  pseudo's  END,  ORG,  DC,  etc.   The 
symbolic  information  on  each  assembly  language  line  is  grouped  into  four 
fields:   the  location  name  or  label,  the  operation  code  or  mnemonic,  the 
address  field  or  operand,  and  comment  fields.   These  fields  are  usually 
delimited  by  blanks.   The  ASSEMBLER  assumes  blanks  as  delimiters.   If 
the  user  wishes  to  use  a  different  symbol,  he  has  to  define  it  in  the 
tables  that  are  used  for  the  translation  of  the  particular  field.   The 
fields  are  described  in  the  description  of  the  particular  machine. 
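The division of a line into four blank-delimited fields can be sketched in Python as follows. This is only an illustration, not the ASSEMBLER's own code: the function name split_fields is hypothetical, and the sketch assumes that a label, when present, begins in character 1 and that comments are introduced by the slash (/).

```python
def split_fields(line, comment_char="/"):
    """Split one source line into (label, mnemonic, operand, comment).

    Assumptions for this sketch: fields are delimited by blanks, a label
    (if any) starts in character 1, and comment_char starts the comment.
    """
    text, _, comment = line.partition(comment_char)
    has_label = bool(text) and not text[0].isspace()
    parts = text.split()
    label = parts.pop(0) if has_label and parts else ""
    mnemonic = parts.pop(0) if parts else ""
    operand = " ".join(parts)          # e.g. "I PTR" for indirect addressing
    return label, mnemonic, operand, comment.strip()
```

A line without a label must therefore begin with at least one blank, so that the first word is taken as the mnemonic.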


The location name usually starts at character 1 and is
terminated by the first blank. We say "usually" because even a machine
whose format has the operation code first, the location next, and the
address last, or which uses an asterisk (*) as a delimiter, could be
handled provided that the user describes it so to the ASSEMBLER. If
the location field is non-empty it may contain a name of up to eight
characters, beginning with a letter. Any variable used in the program
must be defined by its appearance in the location field. The variables
used with some pseudo's, e.g. EQU, must be predefined, that is, defined
at some point before the pseudo is processed. For instance

A  EQU  B+1
B  EQU  10

is illegal, whereas

B  EQU  10
A  EQU  B+1

is allowed. As another example

A  EQU  B+10
B  EQU  2*A-30

does not define B as 10 and A as 20, although these values satisfy
both lines, because the ASSEMBLER does not solve simultaneous
definitions.

The operation code field, or mnemonic field, is the expression
starting with the first non-blank character after the location field
and ending with the next blank. Any variable appearing in the
operation code field must be an operation code. In the particular case
of the PDP computers, if the operation code is a microinstruction then
the address field or operand field is empty. Otherwise it starts with
the first non-blank character after the mnemonic and ends with the
next blank. Any variable appearing in the address field must be a
label.
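The rule that names used with EQU must be predefined can be illustrated with a short Python sketch. It is not the ASSEMBLER's own code: the function name process_equ is hypothetical, and for brevity the sketch handles only the operators + and -.

```python
import re


def process_equ(statements):
    """Process (name, expression) EQU pairs strictly in source order.

    Every name in an expression must already be in the table when the
    line is reached; otherwise the line is rejected.
    """
    table = {}
    for name, expr in statements:
        total, sign = 0, 1
        for token in re.findall(r"[A-Z][A-Z0-9]*|\d+|[+-]", expr):
            if token == "+":
                sign = 1
            elif token == "-":
                sign = -1
            elif token.isdigit():
                total += sign * int(token)
            elif token in table:
                total += sign * table[token]
            else:
                raise ValueError(token + " used in EQU before it is defined")
        table[name] = total
    return table
```

With the pairs ("B", "10") and ("A", "B+1") in that order the table receives B = 10 and A = 11; in the reverse order the first line is rejected because B is not yet in the table.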


The above three fields may total up to 72 characters. The
comment field starts at the end of the address field and may extend
up to the 80th character. Comments after the address field may or
may not be preceded by the particular comment character, which is the
slash (/) in the case of the PDP-8 and the asterisk (*) for some other
machines.

The comment has no effect on the binary output of the
assembler; it is only copied onto the assembly listing. It is very
useful both to the programmer, as a reminder of what he is trying to
perform with a particular instruction or group of instructions, and to
other programmers who might want to use the program. The end of
the program is sensed by the END pseudo. In the case of Load-and-Go
assembly, the last instruction executed will be a transfer to the
address evaluated from the END pseudo. When the output of the assembler
is input to the loader, this address is placed in a suitable
position on a transfer card image.

The assembler provides two kinds of output: the binary
object "deck", and the assembly listing. The former is the machine
program in a form acceptable to the loader of the particular
computer, and the latter helps the programmer to find certain
programming errors in his program.

The  ASSEMBLER  is  a  typical  two-pass  one,  where  the  first 
pass  is  used  to  produce  a  table  of  all  symbolic  addresses  used  and 
their  address  values,  whereas  the  second  is  used  to  substitute  these 
values  into  the  original  symbolic  form  to  get  the  binary  form. 


3.   PASS  I 

To  construct  the  table  each  input  line  of  code  is  read,  one 
at  a  time.   If  a  name  appears  in  the  location  field,  then  it  is  put 
into  the  table.   The  assembler  assumes  that  the  first  instruction  to 
be  read  is  placed  into  location  zero,  the  next  into  location  one,  and 
so  on.   In  order  to  know  what  address  value  to  associate  with  a  name  in 
the  location  field,  the  location  counter  keeps  an  account  of  the  space 
used  and  it  is  incremented  after  each  instruction  has  been  handled. 

Each entry in the table consists of the name, the associated
address value, a pointer pointing to the left, one pointing to the
right, and information on relocation, where by left and right we mean
alphabetically smaller and bigger entries respectively. Since a check
has to be made for double definition of names during Pass I, it is
necessary to determine whether the name just read in the location
field is already present in the table. This involves some sort of
table look-up procedure, and the binary tree method has been chosen
for this purpose.

The search mechanism consists of comparing the desired name
with the entry in the fixed position reserved for the middle entry, as
in the binary search. If the desired name matches, there is nothing
further to do; otherwise the search continues with the entry pointed to
by the appropriate one of the two pointers (chain addresses). The table
built is a tree with two branches at every node, and each node is
labeled by a table entry. Storing the information this way for a binary
search has the advantage that the entry at the start, or root, of the
table need no longer be the middle entry. Whichever entry is used as
the root, it is only necessary that all other entries be to the left
or right of it, depending on whether they are smaller or larger than it.
It has the advantage, also, that names can be added to the table at
any time. If a new name is to be entered, a search process is followed
which will finally come to the end of a chain, indicated by a zero link
address. If the name is found in the search then it obviously should
not be re-entered. When the end of a chain is reached, the new entry
is added at that point and the appropriate link established. This
method has the additional advantage that it is easy to print the table
in alphabetical order when the time comes. It has the disadvantage that
if the names are entered in alphabetical order, the tree is one-sided.
The search is then as slow as a sequential search, and the table takes
up more space because of the pointers.
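The binary tree table described above can be sketched in Python as follows. The class and method names are hypothetical, None stands in for the zero link address, and the relocation information kept in each entry is omitted.

```python
class Entry:
    """One table entry: name, value, and the two chain pointers."""

    def __init__(self, name, value):
        self.name, self.value = name, value
        self.left = self.right = None    # None plays the role of a zero link


class NameTable:
    def __init__(self):
        self.root = None

    def enter(self, name, value):
        """Add a name; return False if it is already present."""
        new = Entry(name, value)
        if self.root is None:
            self.root = new
            return True
        node = self.root
        while True:
            if name == node.name:
                return False             # double definition
            side = "left" if name < node.name else "right"
            child = getattr(node, side)
            if child is None:            # end of chain: link new entry here
                setattr(node, side, new)
                return True
            node = child

    def lookup(self, name):
        node = self.root
        while node is not None and node.name != name:
            node = node.left if name < node.name else node.right
        return None if node is None else node.value

    def listing(self):
        """Return (name, value) pairs in alphabetical order."""
        out, stack, node = [], [], self.root
        while stack or node is not None:
            while node is not None:      # go as far left as possible
                stack.append(node)
                node = node.left
            node = stack.pop()
            out.append((node.name, node.value))
            node = node.right
        return out
```

If the names are entered in alphabetical order, every new entry is linked to the right of the previous one, which is the one-sided case mentioned in the text.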

If the input lines contained only instructions, then
the simple mechanism of incrementing the location counter by one for
each line would be sufficient. However, there has to be at least one
pseudo instruction in any code, the END pseudo, which tells the
assembler that the whole deck has been read so that the next pass can
begin. Of course there are many other pseudo's. So in order that Pass I
recognize the difference between instructions and pseudo orders, it
must examine the mnemonic coding. The mnemonic is a string of
characters similar to a symbolic name, so that similar techniques are
used to handle mnemonics. In this case, the table of mnemonics is set
up by the user; the mnemonics are on the first cards of the description
of the particular machine, which are read and stored using the same
binary tree technique as for the names. Each entry consists of the
mnemonic, the actual code, the left and right pointers, and a branch
address which gives the address of the code used to handle the
mnemonic or pseudo order.

The typical flow of Pass I is the following. The input line
of code is read (if it is a comment no particular action is taken, so
we will not mention comments) and the mnemonic extracted. The mnemonic
code is looked up in the mnemonic table. If it is not present and
macros are not allowed, there is an error; if it is present, a branch
is made to the address found in the table. Then the appropriate
section of code takes care of the rest of the line. When the input
line, which is usually a card image from magnetic tape or disk file,
has been read into the program area of memory and a copy has been
produced for later use in Pass II, the assembler must extract various
fields from it in order to form names, mnemonics, and addresses. The
positions and lengths of these fields are described to the ASSEMBLER
by the user. The various fields are then handled as follows.

Location field - This may be a fixed length field. Characters
are extracted one at a time. If the first non-blank character is other
than an alphabetic character, it is not normally an allowed name.
Otherwise subsequent characters must be alphabetic, numeric, or blank.
A blank indicates the end of the name, in which case the field should
contain no more non-blank characters. In other words, blanks are not
allowed in the names. If the user wants a delimiter other than the
blank, he has to describe it in the table used for the translation and
testing of the characters making up the location field.
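The checks applied to the location field can be sketched in Python as follows. The function name and the wording of the error messages are hypothetical; the sketch assumes upper case letters, the blank as delimiter, and the limit of eight characters mentioned earlier.

```python
import string

LETTERS = string.ascii_uppercase
DIGITS = string.digits


def scan_location_field(field, max_len=8):
    """Return the name in a location field, or None if the field is blank."""
    name = field.strip(" ")
    if not name:
        return None                      # no label on this line
    if name[0] not in LETTERS:
        raise ValueError("name must begin with a letter")
    if " " in name:
        raise ValueError("blanks are not allowed inside a name")
    if any(ch not in LETTERS + DIGITS for ch in name):
        raise ValueError("name may contain only letters and digits")
    if len(name) > max_len:
        raise ValueError("name is longer than %d characters" % max_len)
    return name
```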

Mnemonic field - This may also be a fixed length field, except
that it is required to start in the first column of the field. A
program similar to the one above is used, except that the first
character must be a non-blank character. There is no logical reason to
restrict mnemonics to start with an alphabetic character and contain
only alphanumeric characters, but they frequently are so restricted,
which makes it convenient to use the same reading procedures.

Address field - This field may differ from instruction to
instruction, in many cases containing a number of subfields separated
by commas. Within each subfield the address can be an expression
involving names and numbers and some of the arithmetic operators such
as plus (+), minus (-), and multiply (*). We will come back to names
and expressions later. The first step in the process of decomposing
such a subfield is to break it into separate elements such as names,
operators, numbers, etc. There are many ways to do this. It is, however,
faster to perform a lexicographic scan of the field first. By a
lexicographic scan we mean an analysis that concerns itself with the
characters one at a time, taking into account only the immediate
neighbor of each character.

The recognition of the elements of the subfield is performed
very simply by scanning from left to right, and noting the following.

Names start with a letter and contain letters or digits.

Numbers start with a digit and contain only digits.

Starting from the left, the next character is examined. If it is a
letter, then a name is recognized. A subscanner for name recognition
examines consecutive characters until a non-alphanumeric character is
read. This signals the end of the name. Since names are restricted to a
maximum length, a check is made for excessive characters. After the
string of characters representing the name has been scanned, control
returns to the basic recognizer. The next character is examined and a
branch to a basic recognizer for numbers, names, or operators is made.
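The left-to-right scan just described can be sketched in Python as follows. The function name and the token labels are hypothetical, and the special character for the current location is left out for brevity.

```python
import string

LETTERS = string.ascii_uppercase
DIGITS = string.digits


def scan_subfield(text, max_name_len=8):
    """Break an address subfield into (kind, value) elements."""
    tokens, i = [], 0
    while i < len(text):
        ch = text[i]
        if ch in LETTERS:                          # subscanner for names
            j = i
            while j < len(text) and text[j] in LETTERS + DIGITS:
                j += 1
            if j - i > max_name_len:
                raise ValueError("name too long: " + text[i:j])
            tokens.append(("name", text[i:j]))
            i = j
        elif ch in DIGITS:                         # subscanner for numbers
            j = i
            while j < len(text) and text[j] in DIGITS:
                j += 1
            tokens.append(("number", int(text[i:j])))
            i = j
        elif ch in "+-*":                          # one-character operators
            tokens.append(("operator", ch))
            i += 1
        else:
            raise ValueError("unexpected character: " + ch)
    return tokens
```

Each branch looks only at the current character and its immediate successors, which is all that the scan requires.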

The  recognition  process  for  names,  numbers,  etc.,  involves 
more  than  just  checking  for  the  existence  of  the  name,  number,  etc. 
Something  meaningful  has  to  be  done  with  the  address.   Although  the
address  fields  of  instructions  need  not  be  translated  until  Pass  II, 
the  address  fields  of  some  pseudo  orders  affecting  the  location  counter 
will  have  to  be  converted  to  numbers  in  Pass  I.   This  means  that  the 
characters  in  a  name  will  have  to  be  packed  together  in  a  form  suitable 
for  the  table  look-up  process,  that  decimal  digits  will  have  to  be 
converted  into  binary  integers,  and  that  the  calculation  indicated  will 
have  to  be  performed  between  the  operands. 

This  particular  kind  of  action  taken  for  certain  pseudo  orders 
is  realized  by  branching  to  the  appropriate  address  going  with  every 
mnemonic  or  pseudo.   In  the  case  that  the  expression  will  have  to  be 
evaluated  in  Pass  I,  normally  it  has  to  be  well  defined,  that  is,  all 
of  the  names  appearing  in  this  expression  must  be  previously  defined 
in  the  name  table. 
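The flow of Pass I, with a branch through the mnemonic table to the code handling each mnemonic or pseudo, can be sketched in Python as follows. All names here are hypothetical, a dictionary replaces the binary tree, lines are given already split into (label, mnemonic, operand), and only ORG and END stand in for the pseudo orders. The octal codes shown are the PDP-8 operation codes for TAD, DCA, and JMP.

```python
def define(state, label):
    """Enter a label with the current location counter, checking for reuse."""
    if not label:
        return
    if label in state["names"]:
        state["errors"].append("double definition of " + label)
    else:
        state["names"][label] = state["lc"]


def do_instruction(state, label, operand):
    define(state, label)
    state["lc"] += 1                   # one word per instruction


def do_org(state, label, operand):
    state["lc"] = int(operand)         # operand must be known in Pass I
    define(state, label)


def do_end(state, label, operand):
    define(state, label)
    state["done"] = True


# Each entry: mnemonic -> (code, handler), the handler being the "branch address".
MNEMONICS = {
    "TAD": (0o1000, do_instruction),
    "DCA": (0o3000, do_instruction),
    "JMP": (0o5000, do_instruction),
    "ORG": (None, do_org),
    "END": (None, do_end),
}


def pass_one(lines, mnemonics=MNEMONICS):
    state = {"lc": 0, "names": {}, "errors": [], "done": False}
    for label, mnemonic, operand in lines:
        entry = mnemonics.get(mnemonic)
        if entry is None:
            state["errors"].append("unknown mnemonic " + mnemonic)
            continue
        _code, handler = entry
        handler(state, label, operand)
        if state["done"]:
            break
    return state
```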
4.   PASS  II

The  purpose  of  Pass  II  of  the  assembler  is  to  convert  the 
source  language  into  binary,  using  the  name  table  constructed  in 
Pass  I  to  convert  the  addresses  and  the  mnemonic  table  to  convert  the 
instructions.   To  do  this,  a  copy  of  the  source  program  is  read,  a  line 
at  a  time,  and  many  of  the  steps  of  Pass  I  repeated. 

The  location  field  is  ignored  because  it  was  completely 
handled  in  Pass  I. 

The  mnemonic  field  is  examined  and  a  table  look-up  performed. 
In  this  pass,  the  mnemonic  table  provides  both  a  branch  address  for 
instructions  or  pseudo  orders  and  in  the  case  of  instructions,  the 
binary  code. 

The  address  subfields  are  converted  into  binary  numbers  for 
packing  into  the  instructions  or  use  in  pseudo  orders.   The  code  in 
the  Pass  I  handling  of  pseudo  order  addresses  is  re-used  for  this 
process.   A  location  counter  is  maintained  in  an  identical  manner  to 
Pass  I.   Pass  II  is  also  terminated  when  the  END  pseudo  order  is  read. 
The  END  pseudo  can  involve  an  address  field  which  is  used  to  provide  a 
starting  address  at  execution  time.   In  the  case  of  a  Load-and-Go 
assembler,  the  last  instruction  executed  would  be  a  transfer  to  the 
address evaluated from the END pseudo. In our case, where the output
of the ASSEMBLER is input to a loader, this address is placed in a
suitable position on a binary card image.
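A much simplified Pass II can be sketched in Python as follows. The names are hypothetical, the name table is assumed to have been built already by Pass I, and each word is formed simply as the operation code augmented by the value of the operand, without the page and indirect bits. The octal codes are the PDP-8 operation codes.

```python
OPCODES = {"AND": 0o0000, "TAD": 0o1000, "DCA": 0o3000, "JMP": 0o5000}


def operand_value(operand, names):
    """Convert an operand (empty, decimal number, or name) to an integer."""
    if operand == "":
        return 0
    if operand.isdigit():
        return int(operand)
    if operand not in names:
        raise KeyError("NAME UNDEFINED: " + operand)
    return names[operand]


def pass_two(lines, names, opcodes=OPCODES):
    """Return ([(address, word), ...], starting address or None)."""
    lc, words, start = 0, [], None
    for _label, mnemonic, operand in lines:   # labels were handled in Pass I
        if mnemonic == "END":
            if operand:
                start = operand_value(operand, names)
            break
        if mnemonic == "ORG":
            lc = operand_value(operand, names)
            continue
        words.append((lc, opcodes[mnemonic] + operand_value(operand, names)))
        lc += 1
    return words, start
```

The location counter is maintained exactly as in Pass I, so that each word produced lands at the address the name table assumed for it.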
5.   NAMES  AND  EXPRESSIONS

We  have  mentioned  names  and  expressions  in  the  process  of 
decomposing  the  address  field.   A  name  is  a  symbol  which  stands  for  a 
numeric  value.   It  may  stand  for  a  self-defining  value,  called  a 
constant;  or  it  may  stand  for  a  value  which  is  defined  elsewhere,  a 
variable.   A  variable  may  be  an  operation  code  or  a  pseudo  order,  in 
which  case  it  is  defined  from  the  mnemonic  table  read  and  built  in 
the  description  of  the  particular  machine  or  it  may  be  a  label,  in 
which  case  it  is  defined  by  its  appearance  in  the  location  field  of 
some  input  line.   If  this  line  corresponds  to  a  memory  location,  then 
the  defined  value  of  the  label  is  the  address  of  this  location.   If 
the  operation  field  of  the  line  is  the  pseudo  EQU  or  DC,  the  defined 
value  of  the  label  is  the  value  of  the  expression  in  the  operand 
(address) field. There is a special name which is self-defining. Its
value is the current contents of the location counter. This special
name is given in the description. It may be the dot (.) as in the
case of the PDP series or the asterisk (*) as in the case of the
IBM 360, for example.

The  following  EBCDIC  characters  may  be  used  in  the  formation 
of  names  and  expressions. 

Alphabetic:   Upper  case  letters  A-Z 

Numeric:     Digits  0-9 

Operators:   +  -  *  (plus,  minus,  multiply) 

Delimiters:   Blanks  assumed  unless  otherwise  specified. 

Special  character  for  comment  field  as  specified 
in  the  description. 
Names may be up to eight characters long. Variables may contain
alphabetic and numeric characters, but they must start with an
alphabetic character. Constants contain only digits. An expression
is a sequence of names separated by the operators +, -, and *, and
delimited by blanks. In the mnemonic field, all variables must be
operation codes or pseudo orders. In the address field (operand
field) all variables must be labels. The assembler evaluates the
expression from left to right by combining the values of the names
according to the operators. The most general form of an expression
in the address field is

N*A±B

where N is an integer and A, B are names, absolute or relocatable.
The assembler produces relocation bits with each address, which tell
the loader whether or not relocation is to be applied to that particular
address. In addition to the value of the name, the name table contains
an entry which indicates whether or not a name is relocatable. Any name
appearing in the location field of an instruction is relocatable, as
are names in the location fields of certain pseudo orders. The pseudo
order which can define a non-relocatable name is the EQU pseudo.
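The left-to-right evaluation of an expression mentioned above can be sketched in Python as follows. The function name is hypothetical. The sketch takes "left to right" literally, so that no operator takes precedence over another; this is an assumption, since the text does not say whether multiplication is performed first. Relocation is not tracked, and malformed expressions are not diagnosed.

```python
import re


def evaluate(expr, names):
    """Evaluate an expression such as N*A-B strictly from left to right."""
    tokens = re.findall(r"[A-Z][A-Z0-9]*|\d+|[+\-*]", expr)

    def value(token):
        if token.isdigit():
            return int(token)          # a constant is self-defining
        if token not in names:
            raise KeyError("NAME UNDEFINED: " + token)
        return names[token]

    result = value(tokens[0])
    # Tokens alternate operand, operator, operand, ...
    for operator, operand in zip(tokens[1::2], tokens[2::2]):
        v = value(operand)
        if operator == "+":
            result += v
        elif operator == "-":
            result -= v
        else:
            result *= v
    return result
```

For expressions of the general form N*A±B the result is the same whether or not multiplication is given precedence; the two readings differ only for expressions such as 2+3*4.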

An absolute (non-relocatable) address can be constructed from
any allowable expression involving numbers and absolute valued symbolic
addresses. For example, 20*3-4*A+5 is a valid absolute address if A
is an absolute name. A relocatable address can be constructed by adding
any absolute amount to, or subtracting it from, a relocatable name. For
example, if B is a relocatable name then B-4 is also relocatable, as is
B+20.
There are cases where an expression like B-A+1 is needed, and we would
like to arrange that the difference of two relocatable address expressions
is an absolute expression. In this way, the expression A-B+C would be
legal unless A and C were relocatable and B absolute. Although the
address A+C may be needed by the programmer in some cases where both
A and C are relocatable, it is not possible for the loader to handle
it with only one relocation bit. With only one bit, the loader can
apply either single relocation or none. Even with single
relocation it is possible to allow expressions such as 2*A-B where
both A and B are relocatable, since the total relocation is still
single. The ASSEMBLER, assuming two bits and handling double, single,
or no relocation, restricts the general expression N*A±B to the cases
listed in Table 1, which also gives the relocation produced in each
case.

Table  1.
RELOCATION  RESULTING  FROM  THE  EXPRESSION  N*A±B

A      B      OP     N        Relocation

Rel    Rel    +      ≤ 1      Double
Rel    Abs    +      ≤ 2      Single or Double
Abs    Rel    +      "any"    Single
Abs    Abs    +      "any"    No Relocation
Rel    Rel    -      ≤ 3      No Relocation, Single or Double
Rel    Abs    -      ≤ 2      Single or Double
Abs    Rel    -      "any"    Single
Abs    Abs    -      "any"    No Relocation
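The restrictions of Table 1 can be held as data and checked with a few lines of Python. This is only a sketch: the names are hypothetical, and the sketch does no more than look up the row of the table and test N against the limit given there.

```python
# (A, B, operator) -> (largest N allowed, or None for "any"; relocation)
TABLE_1 = {
    ("rel", "rel", "+"): (1, "double"),
    ("rel", "abs", "+"): (2, "single or double"),
    ("abs", "rel", "+"): (None, "single"),
    ("abs", "abs", "+"): (None, "none"),
    ("rel", "rel", "-"): (3, "none, single or double"),
    ("rel", "abs", "-"): (2, "single or double"),
    ("abs", "rel", "-"): (None, "single"),
    ("abs", "abs", "-"): (None, "none"),
}


def classify(a_kind, b_kind, operator, n):
    """Return the relocation listed in Table 1, or reject too large an N."""
    limit, relocation = TABLE_1[(a_kind, b_kind, operator)]
    if limit is not None and n > limit:
        raise ValueError("N exceeds the limit of %d for this form" % limit)
    return relocation
```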


The restrictions on the integer N are imposed by the requirement of
at most double relocation. Where N is shown as "any", its value is
limited only in that the limitations of the particular computer must
not be exceeded.

When an expression is evaluated, a check is made to find whether or
not the value of the expression lies within the current block of core
memory, referred to as a "page." If it does, the same-page bit of the
assembled instruction is set to one. If this bit is zero, the
instruction refers to "page" zero, so that any location in "page" zero
can be addressed directly from any page of core memory. All other core
memory locations can be addressed only indirectly, by setting the
indirect bit. The rest of the bits specify the location in the
current "page" or "page" zero; with indirect addressing, this location
contains the full absolute address of the operand. Indirect addressing
is sensed by the presence of a special character, "I" in the case of
the PDP-8, preceded and followed by a blank. This special character is
given in the description of the particular computer.
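For the PDP-8, whose twelve-bit memory reference instructions consist of a three-bit operation code, the indirect bit, the page bit, and a seven-bit location within a page of 128 words, the setting of these bits can be sketched in Python as follows. The function name is hypothetical, and the sketch is not the ASSEMBLER's own code.

```python
PAGE_SIZE = 128      # a 7-bit location field addresses 128 words


def encode(opcode, address, location, indirect=False):
    """Assemble one PDP-8 memory reference instruction.

    opcode   : 3-bit operation code (e.g. 1 for TAD)
    address  : address named in the instruction (the pointer word when
               indirect addressing is used)
    location : address at which the instruction itself is placed
    """
    page = address // PAGE_SIZE
    if page == 0:
        page_bit = 0                                # "page" zero
    elif page == location // PAGE_SIZE:
        page_bit = 1                                # same page as instruction
    else:
        raise ValueError("address is neither on page zero nor on the "
                         "current page; an indirect reference is needed")
    offset = address % PAGE_SIZE
    return (opcode << 9) | (int(indirect) << 8) | (page_bit << 7) | offset
```

For example, TAD to location 250 (octal) from an instruction at 200 (octal) lies on the same page and assembles to 1250 (octal), while TAD I through location 20 (octal) of page zero assembles to 1420 (octal) from any page.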
6.   PSEUDO  ORDERS

Pseudo orders are operation codes which do not represent
actual machine instructions, but are simply signals to the assembler
to take certain action. The pseudo's provided by the ASSEMBLER,
together with their effects, are given below.
Data  Loading: 
DC  -  Define  Constant 

Define the optional symbol in the location field to have a value
equal to the current contents of the location counter. Then
place the value of the expression in the address field in
the memory location signified by the current contents of the
location counter. It is necessary to determine how many words
of storage will be occupied by the data given in the pseudo, so
that the location counter can be incremented accordingly during
Pass I. In scanning the field, the first character determines
the type of field following. It may be preceded by a repetition
factor. For some characters, an L may follow with a length
specification. Finally the data appears inside quotation marks
('). During Pass I the program determines the boundary alignment
of each field in the DC in order to calculate the location counter
change. During Pass II the location field is ignored, but the
address field is converted into binary. At the same time the
location counter is increased once for each word produced.
Location  Counter  Control:

ORG  -  Set  the  location  counter  to  a  specific  quantity. 

Sets the location counter to the value specified in the address
field, in Pass I and Pass II, so that the next instruction read
will be loaded at this address. If a name appears in the location
field, then it is put into the name table after the location
counter has been changed, and is given a value equal to the new
contents of the location counter.

BSS  -  Block  Started  by  Symbol 

Any name in the location field must be entered in the name table
before the location counter is incremented.

BTS  -  Block  Terminated  by  Symbol 

Any name in the location field must be equated to the location
of the last word of the block, that is, to one less than the
contents of the location counter after it has been incremented.
As long as the addresses in the address field are purely numeric,
there are no problems. However, if a symbolic address is
involved, then a value has to be assigned to it, that is, it has
to be predefined in order that the numeric value of the address
can be calculated. In Pass I, only those names that appeared
before the line being currently examined are in the name table
with numeric values. Therefore, names must be defined before
they are used in any pseudo orders that affect the location
counter in a manner dependent on their address field.
DS  -  Define  Storage

Define the optional symbol in the location field to have a
value equal to the current contents of the location counter.
Then add the value of the expression (predefined) in the
address field to the contents of the location counter. DS is
similar to DC and the same piece of code is used in both passes;
the only difference is that DS does not actually produce object
code, whereas the DC must have the data specified in the address
field.
Name  Table  Entry: 
EQU  -  Symbolic  Equivalence 

Define  the  name  in  the  location  field  to  have  a  value  equal  to 
that  of  the  expression  (predefined)  in  the  address  field. 
Others: 
END  -  End  Assembly 

Define the optional symbol in the location field to have a
value equal to the current contents of the location counter.
If the address field is non-empty, then its value will be
punched on a binary transfer card as the starting address
of the program.
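The effect of the pseudo orders above on the location counter and the name table during Pass I can be sketched in Python as follows. The function name is hypothetical, the address field is assumed to have been evaluated already, and the sketch assumes that BSS and BTS advance the location counter by the value of the address field. DC and END are left out.

```python
def apply_pseudo(op, label, value, lc, names):
    """Apply one pseudo order in Pass I and return the new location counter.

    value is the already evaluated address field; names is the name table.
    """
    if op == "ORG":
        lc = value                     # name gets the NEW counter contents
        if label:
            names[label] = lc
    elif op in ("BSS", "DS"):
        if label:
            names[label] = lc          # name marks the start of the block
        lc += value
    elif op == "BTS":
        lc += value
        if label:
            names[label] = lc - 1      # name marks the last word of the block
    elif op == "EQU":
        names[label] = value           # location counter is unchanged
    else:
        raise ValueError("not a pseudo order handled here: " + op)
    return lc
```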
7.   DESCRIPTION  OF  THE  MACHINE

This is the part of the program which makes the ASSEMBLER
work for many types of machines. At the end of the ASSEMBLER and
before the END pseudo which signals the end of the process, the user
has to include a small deck of cards which define his particular
machine to the program. On each card there are three fields punched
starting at columns 1, 10, and 16, respectively. The first field is
the name used for this particular piece of information throughout the
program. Thus, in order to define another machine, only
this small deck of the description will have to be changed, with the
names in the program and the description remaining the same. The
second field defines the first field as a constant or as equivalent to
the third field. The second field, also, will remain unchanged when
defining another machine. The third field is the actual description
and it is the one which changes when the machine changes. For example,
if the character slash (/) is used to specify "comment" for one user
and the character semicolon (;) for a second, then the third field will
be C'/' for the first user and C';' for the second, the C standing for
character and the actual character following, enclosed in quote marks.
Also, if the mnemonic starts at column 6 in the program of one user and
at column 10 in the program of a second, the third field will be 6 for
the first and 10 for the second. Similarly, if the character I signifies
indirect addressing for the one and the asterisk (*) for the other, the
third field will be CL2'I' for the first and CL2'*' for the second,
with CL2 signifying that there will be two characters in the quote marks,
the first specifying the indirect addressing and the second being the
character blank.

The following list gives the use of each name in the
description and the form in which its definition is given.

Alpha:    Character specifying comment, given in the form

ALPHA DC C'/'

Beta:     Starting column for mnemonic, given in the form

BETA EQU 6

Gamma:    Length of mnemonic, given in the form

GAMMA EQU 8

Delta:    Starting column for location name, given in the form

DELTA EQU 1

Epsilon:  Length of location name, given in the form

EPSILON EQU 9

Eta:      Starting column for address, given in the form

ETA EQU 10

Theta:    Length of address, given in the form

THETA EQU 73-ETA

Iota:     Character specifying indirect addressing, given in the form

IOTA DC CL2'I'

Kappa:    Character specifying current address, given in the form

KAPPA DC C'.'

Lamda:    Length of operation code in number of bits, given in the form

LAMDA EQU 3

Mi:       Length of operation code in number of bits, given in the form

MI EQU 12

Pi:       Length of operation code in case of microprogramming
in number of bits, given in the form

PI EQU 12
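
To show how the column and length names in the list work together, the
following Python sketch cuts a source card into its location, mnemonic
and address fields. It is an illustration under stated assumptions, not
the ASSEMBLER's scanning routine: the function split_card, the sample
card, and the particular column values are all hypothetical, chosen so
that the three fields do not overlap.

```python
def split_card(card, d):
    """Cut a source card into its three fields.

    `d` maps description names (ALPHA, BETA, ...) to their values.
    Columns are 1-based, as in the description.
    """
    def field(start, length):
        return card[start - 1:start - 1 + length].strip()

    # Simplification: everything from the ALPHA character to the end of
    # the card is treated as comment, wherever the character occurs.
    comment_at = card.find(d["ALPHA"])
    if comment_at >= 0:
        card = card[:comment_at]
    return {
        "location": field(d["DELTA"], d["EPSILON"]),
        "mnemonic": field(d["BETA"], d["GAMMA"]),
        "address": field(d["ETA"], d["THETA"]),
    }

# Hypothetical description values, for illustration only.
ETA = 16
description = {
    "ALPHA": "/",
    "DELTA": 1, "EPSILON": 8,      # location name: columns 1-8
    "BETA": 10, "GAMMA": 4,        # mnemonic: columns 10-13
    "ETA": ETA, "THETA": 73 - ETA, # address: column 16 through column 72
}

# Build a sample card so that each field starts in its stated column.
card = "LOOP".ljust(9) + "TAD".ljust(6) + "I PTR".ljust(12) + "/ fetch via pointer"
fields = split_card(card, description)
```

Changing only the values in the description, for instance BETA or
ALPHA, makes the same routine accept a card layout used by a different
programmer or machine.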
8.   DIAGNOSTIC  MESSAGES

When  the  ASSEMBLER  detects  an  error,  or  when  the  user 
should  be  notified  by  means  of  a  warning,  it  prints  diagnostic 
messages  to  help  the  programmer  correct  the  cause  of  error.   If  an 
error  is  detected  in  Pass  I  a  flag  is  set  so  that  the  assembly  will 
not  continue  to  Pass  II.   The  diagnostic  messages  and  their  meanings 
are  listed  below. 
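
The role of the flag can be sketched as follows. This is a minimal
Python illustration, assuming a single flag shared by all critical
errors; the names Diagnostics, report and assemble are hypothetical and
do not come from the ASSEMBLER itself.

```python
class Diagnostics:
    """Collects diagnostic messages and remembers whether any was critical."""

    def __init__(self):
        self.messages = []
        self.flag = False  # set by a critical error

    def report(self, text, critical=True):
        self.messages.append(text)
        if critical:
            self.flag = True


def assemble(pass_one, pass_two):
    """Run Pass I, then Pass II only if the flag was not set."""
    diag = Diagnostics()
    pass_one(diag)
    if diag.flag:
        return diag, None  # assembly does not continue to Pass II
    return diag, pass_two(diag)
```

A warning is reported with critical=False, so it is printed but does not
stop the assembly.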

'MNEMONIC DOUBLE-DEFINED'. This message is printed, after the
mnemonic itself, when the same mnemonic is given more than once as
input for building the mnemonic table. The error is not critical
unless the user has defined two different operation codes with the
same name, so the flag is not set and assembly will continue. The
message is simply given as a warning.

'INVALID MNEMONIC'. The mnemonic contains an invalid
character. The flag is set.

'NAME  UNDEFINED'.   During  the  evaluation  of  an  expression 
in  the  address  field,  a  name  was  encountered  which  was  not  defined 
in  the  program.   Note  that  names  in  some  pseudo  orders  must  be  predefined. 

'SHOULD BE MORE ENTRIES IN THE TABLE'. This message is
printed when the mnemonic look-up procedure traces the pointers from
the root of the tree down to the branches and either reaches the
smallest entry while the search argument is still smaller, or reaches
the largest entry while the search argument is still larger; in other
words, the mnemonic is not found in the mnemonic table built by the
programmer. The flag is set in this case.
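
The look-up just described can be sketched with an ordinary binary
search tree. The Python below is an illustration only: the source says
the table is a tree followed by pointers from the root, but the node
layout, the function names, and the choice to keep the first entry when
a mnemonic is given twice are assumptions. The opcode values are those
of the PDP-8 memory-reference instructions.

```python
class Node:
    def __init__(self, mnemonic, opcode):
        self.mnemonic, self.opcode = mnemonic, opcode
        self.left = self.right = None


def insert(root, mnemonic, opcode, warnings):
    """Add an entry; a repeated mnemonic only produces a warning."""
    if root is None:
        return Node(mnemonic, opcode)
    node = root
    while True:
        if mnemonic == node.mnemonic:
            # Assumption: the first definition is kept.
            warnings.append(mnemonic + " MNEMONIC DOUBLE-DEFINED")
            return root
        side = "left" if mnemonic < node.mnemonic else "right"
        child = getattr(node, side)
        if child is None:
            setattr(node, side, Node(mnemonic, opcode))
            return root
        node = child


def lookup(root, mnemonic):
    """Follow the pointers from the root; fail when a branch runs out."""
    node = root
    while node is not None:
        if mnemonic == node.mnemonic:
            return node.opcode
        node = node.left if mnemonic < node.mnemonic else node.right
    raise LookupError("SHOULD BE MORE ENTRIES IN THE TABLE")


warnings = []
table = None
for name, code in [("JMP", 5), ("AND", 0), ("TAD", 1), ("TAD", 1)]:
    table = insert(table, name, code, warnings)
```

Here the second TAD entry is reported as a warning, while a search for a
mnemonic that was never entered runs off the end of a branch and raises
the error.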
'FIELD EXCEEDS LENGTH'. This message is printed in the
case that a name exceeds the limits of the location field (more than
eight characters), or a number in the address field exceeds the
machine limitations; also, if a name in the address field is too
long or too large a number is added to the current contents of the
location counter. In other words, this same message is printed in
all cases of violation of length. The flag is set in all cases
except when the scanning of the address field takes place only
during Pass II. Then the assembly will probably continue only for a
while, because the overly long names will not be found in the table
and the overflow of the location counter will be caught at a later
point in the program.
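
As a rough illustration of the length checks listed above, the Python
sketch below tests a name against the eight-character limit and a number
against the word size. The function name and the 12-bit default are
assumptions (12 bits being the PDP-8 word length); the ASSEMBLER itself
takes such limits from the machine description.

```python
def field_exceeds_length(name=None, number=None, name_limit=8, word_bits=12):
    """Return True when a name or a number violates its length limit."""
    if name is not None and len(name) > name_limit:
        return True
    if number is not None and not 0 <= number < (1 << word_bits):
        return True
    return False
```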

'INVALID CHARACTER IN FIELD'. This message is printed
when an invalid character is found in a field, for instance in the
location name (in particular as the first character of a name), or in
a number or a name in the address field. The flag is set if the error
is discovered during Pass I; otherwise the assembly proceeds much as
described for the previous message.

'NAME DOUBLE-DEFINED'. A name in the location field is used
more than once. When a mnemonic was stored twice as an entry in the
mnemonic table the error was not critical, but in this case the flag
is set and assembly will not continue to Pass II.

'OFF-PAGE REFERENCE'. The value of the address field of a
memory-referencing instruction is neither an address in "page" zero nor
an address in the current "page".
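
The test behind this message can be sketched for the PDP-8, whose memory
is divided into pages of 128 words. The Python below is an illustration
only; the function names are hypothetical and the page size is a PDP-8
assumption, not a value taken from the machine description.

```python
PAGE_BITS = 7  # assumption: PDP-8 pages of 2**7 = 128 words


def page_of(address):
    return address >> PAGE_BITS


def check_reference(location, target):
    """Return None if `target` is addressable from `location`, else the message."""
    if page_of(target) == 0 or page_of(target) == page_of(location):
        return None
    return "OFF-PAGE REFERENCE"
```

For an instruction at octal location 200, a reference to octal 277
(same page) or to octal 10 (page zero) is accepted, while a reference to
octal 400 produces the message.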
At the end of the assembly the name table is printed in
alphabetical  order  together  with  information  for  cross-reference, 
namely  length,  value,  and  definition  references. 